1.   Introduction

In the statistical literature, three basic principles are available
for hypothesis testing, namely the likelihood ratio (LR), Wald (W) and
Lagrange multiplier (LM) (or score) principles. Their asymptotic
equivalence  under  the  null  hypothesis  and  under  local  alternatives  is 
well  known.   The  purpose  of  this  paper  is  to  examine  the  additivity  and 
separability  properties  of  these  tests. 

Additivity  focuses  on  the  optimal  way  of  combining  tests  of  different 
hypotheses  and  indicates  a  joint  test  statistic  can  sometimes  be  obtained 
by  adding  up  the  component  statistics.   Alternatively,  rather  than 
applying  a  joint  test,  the  individual  tests  can  be  applied  separately 
and  the  overall  significance  level  can  be  calculated.   An  interesting  feature 
of  LM  statistics  is  that  they  are  sometimes  additive,  that  is  the  LM  test 
for  testing  a  joint  hypothesis  is  the  sum  of  LM  statistics  testing  the 
components  of  the  null  hypothesis  separately.   This  kind  of  additivity  was 
first  noted  by  Pesaran  (1979).   He  found  that  the  LM  test  for  testing 
the  dynamic  specification  of  the  deterministic   and  stochastic   parts  (of 
the  linear  regression  model)  simultaneously  can  be  decomposed  into  two 
independent parts. This holds even for more complicated cases; for
example, the tests developed in Bera and Jarque (1982) for different
combinations of normality (N), homoscedasticity (H), serial independence
(I) of the regression disturbances and functional form (F) were found to
be additive. There are some cases where additivity fails. For example,
if a lagged dependent variable is introduced into the Bera and Jarque
(1982) framework, the tests will not be additive. Nor can the LM test
derived in Jarque and Bera (1982) for testing $H_0: u \sim NH$ against
non-normality ($\bar{N}$) and heteroscedasticity ($\bar{H}$), where $u$
is the disturbance term in a limited dependent variable model, be
decomposed into independent parts. In this paper, we provide the
necessary and sufficient conditions for LM tests to be additive in this
sense and also examine the additivity properties of the LR and W tests.

Aitchison (1962) introduced the concept of separability. Knowing that
two hypotheses are separable is useful because, for large samples, the
computations required for hypothesis testing can be considerably
reduced. If two hypotheses are separable and the sample is large, then
while testing one hypothesis we may be able to ignore the other, that
is, the test is robust to whether the other hypothesis is true or not.
We relate the concept of separability to additivity in the context of
these three testing principles.

2.   Additivity  and  Separability 

Let $\ell_i(\theta)$ denote the log-density function for the $i$th
observation, where $\theta$ is a $p \times 1$ parameter vector. Say we
have $N$ independent observations, then the log-likelihood function is
$\ell = \ell(\theta) = \sum_{i=1}^{N} \ell_i(\theta)$. Assume the
hypothesis to be tested is $H_0: h(\theta) = 0$ where $h(\theta)$ is an
$r \times 1$ vector function of $\theta$ and it is assumed that
$H = H(\theta) = \partial h(\theta)/\partial\theta$ has full column
rank, i.e. rank$(H) = r$. The LM statistic can be written as [see
Breusch and Pagan (1980, p. 240)]

$$LM = \tilde{d}'\tilde{I}^{-1}\tilde{d} = \tilde{\lambda}'\tilde{H}'\tilde{I}^{-1}\tilde{H}\tilde{\lambda} \qquad (1)$$

where $d = d(\theta) = \partial\ell/\partial\theta$ is the efficient
score vector, $I = I(\theta) = E(-\partial^2\ell/\partial\theta\,\partial\theta')$
is the information matrix, $\tilde{\lambda}$ are the Lagrangian
multipliers satisfying the equation $\tilde{d} + \tilde{H}\tilde{\lambda} = 0$
and "~" indicates the quantities have been evaluated at the restricted
maximum likelihood estimate of $\theta$, say $\tilde{\theta}$.

If we partition $H_0$ into $H_A: h_1(\theta) = 0$ and
$H_B: h_2(\theta) = 0$, with a similar partition for $H = [H_1 : H_2]$,
then the LM test for testing $H_0$ will be additive between the two
hypotheses $H_A$ and $H_B$ iff $\tilde{H}'\tilde{I}^{-1}\tilde{H}$ is
block diagonal between the Lagrange multipliers for the two hypotheses,
that is $\tilde{H}_1'\tilde{I}^{-1}\tilde{H}_2 = 0$.¹

¹ As pointed out in Davidson & MacKinnon (1983) and Bera & McKenzie
(1984), a number of alternative asymptotically equivalent forms of the
information matrix are available. For these alternative forms,
additivity will only be asymptotic.
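
The algebra behind this condition can be checked numerically. The sketch below is illustrative only: the matrix `a` stands in for $\tilde{H}'\tilde{I}^{-1}\tilde{H}$, and the multiplier values and function names are made up for the example rather than taken from the paper. It shows that the joint quadratic form equals the sum of the two component forms when the off-diagonal block is zero, and differs by twice the cross term otherwise.

```python
def quad_form(v, m):
    """Return v' m v for a list-of-lists matrix m."""
    size = len(v)
    return sum(v[i] * m[i][j] * v[j] for i in range(size) for j in range(size))


def lm_split(lam, a, r1):
    """Joint quadratic form and its two diagonal-block components.

    lam : multipliers (length r)
    a   : r x r matrix standing in for H' I^{-1} H (hypothetical values)
    r1  : number of restrictions belonging to the first hypothesis
    """
    joint = quad_form(lam, a)
    part_a = quad_form(lam[:r1], [row[:r1] for row in a[:r1]])
    part_b = quad_form(lam[r1:], [row[r1:] for row in a[r1:]])
    return joint, part_a, part_b


lam = [0.7, -1.2, 0.4]

# Off-diagonal block between the first two and the last multiplier is zero.
block_diag = [[2.0, 0.5, 0.0],
              [0.5, 1.0, 0.0],
              [0.0, 0.0, 3.0]]

# Same diagonal blocks, but with a non-zero off-diagonal block.
coupled = [[2.0, 0.5, 0.3],
           [0.5, 1.0, 0.2],
           [0.3, 0.2, 3.0]]

for name, a in (("block diagonal", block_diag), ("coupled", coupled)):
    joint, part_a, part_b = lm_split(lam, a, 2)
    print(name, joint, part_a + part_b, joint - (part_a + part_b))
```

In the coupled case the printed gap is twice the cross term $\lambda_1' A_{12} \lambda_2$, which is exactly what the block-diagonality condition removes.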

If we partition the parameter vector $\theta$ into
$\theta = (\theta_1', \theta_2')'$ and consider testing hypotheses of
the form $H_0: \theta_2 = \theta_2^0$ then the LM test for testing $H_0$
will be additive between all the individual hypotheses if
$\tilde{I}^{22}$ is block diagonal between the testing parameters of the
two hypotheses, where $\tilde{I}^{22}$ refers to the (2,2) block of
$\tilde{I}^{-1}$. This can easily be seen by writing the LM test as [see
Breusch and Pagan (1980, p. 241)]

$$LM = \tilde{d}_2'[\tilde{I}_{22} - \tilde{I}_{21}\tilde{I}_{11}^{-1}\tilde{I}_{12}]^{-1}\tilde{d}_2 = \tilde{d}_2'\tilde{I}^{22}\tilde{d}_2 \quad \text{(say)}$$

where $d_2 = \partial\ell(\theta)/\partial\theta_2$ and
$I_{ij} = E[(\partial\ell/\partial\theta_i)(\partial\ell/\partial\theta_j)']$,
$i, j = 1, 2$.

Obviously, the necessary and sufficient condition for additivity is the
block diagonality of $\tilde{I}^{22}$. Under $H_0$ and appropriate
regularity conditions [see e.g. Serfling (1980, p. 144-5)],
$\partial\ell/\partial\theta_2$ asymptotically follows a multivariate
normal distribution with mean zero and variance-covariance matrix
$I_{22} - I_{21}I_{11}^{-1}I_{12}$. Therefore, the block diagonality of
$I_{22} - I_{21}I_{11}^{-1}I_{12}$, i.e. zero covariance between the two
components of $\partial\ell/\partial\theta_2$ corresponding to the two
hypotheses, implies asymptotic independence of the different components
of $\partial\ell/\partial\theta_2$, and the LM test is based on this
vector.
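
The role of block diagonality can be made explicit with one worked step. The sub-vectors $\theta_A$, $\theta_B$ and the block labels $AA$, $AB$, $BB$ below are notation introduced here for illustration; they are an assumption of this sketch and do not appear in the original.

```latex
% Assumed notation: \theta_2 = (\theta_A', \theta_B')' and \tilde d_2 = (\tilde d_A', \tilde d_B')'
\begin{align*}
\tilde I^{22} &=
  \begin{pmatrix}
    \tilde I^{AA} & \tilde I^{AB} \\
    \tilde I^{AB\prime} & \tilde I^{BB}
  \end{pmatrix}, \\
LM = \tilde d_2' \tilde I^{22} \tilde d_2
   &= \tilde d_A' \tilde I^{AA} \tilde d_A
    + \tilde d_B' \tilde I^{BB} \tilde d_B
    + 2\, \tilde d_A' \tilde I^{AB} \tilde d_B .
\end{align*}
% When \tilde I^{AB} = 0 the cross term vanishes and the joint statistic
% is the sum of two quadratic forms, one per hypothesis.
```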

Additivity of the LM test can easily be related to Aitchison's (1962)
concept of separability. In our example, separability means that, for
large samples, tests of $H_A: h_1(\theta) = 0$ against
$H: h_1(\theta) \neq 0 \mid h_2(\theta) = 0$ and $H_B: h_2(\theta) = 0$
against $H: h_2(\theta) \neq 0 \mid h_1(\theta) = 0$ use the same
critical regions as tests of $H_A: h_1(\theta) = 0$ against
$H: h_1(\theta) \neq 0$ and $H_B: h_2(\theta) = 0$ against
$H: h_2(\theta) \neq 0$ respectively. Aitchison (1962) provided a
sufficient condition for two hypotheses to be separable with respect to
all three statistics: the LR, W and LM. Suppose we want to examine
whether $H_A: h_1(\theta) = 0$ and $H_B: h_2(\theta) = 0$ are separable.
A sufficient condition is that $H_1'I^{-1}H_2 = 0$ for all $\theta$
satisfying $h_1(\theta) = 0$ and $h_2(\theta) = 0$. This condition is
identical² to the necessary and sufficient condition for the LM tests
to be additive. Therefore additivity of the LM test implies
separability of the LM test and, since Aitchison's result applies to
all three test principles, this also implies separability with respect
to the LR and W tests.

Given the additivity of the LM test, it is interesting to investigate
whether the LR and W tests share this property. Let $\ell_{AB}$,
$\ell_{A\bar{B}}$, $\ell_{\bar{A}B}$ and $\ell_{\bar{A}\bar{B}}$ be the
log-likelihood values when both $H_A$ and $H_B$ restrictions, only
$H_A$ restrictions, only $H_B$ restrictions and no restrictions are
imposed respectively. If $LR_{AB}$ is the joint LR test of both
restrictions, $LR_A$ the LR test of $H_A$ restrictions and $LR_B$ the LR
test of $H_B$ restrictions, then

$$LR_{AB} = 2[\ell_{\bar{A}\bar{B}} - \ell_{AB}]$$

$$LR_A = 2[\ell_{\bar{A}B} - \ell_{AB}]$$

$$LR_B = 2[\ell_{A\bar{B}} - \ell_{AB}].$$

Now $LR_{AB} = LR_A + LR_B$ iff

$$\ell_{\bar{A}\bar{B}} = \ell_{\bar{A}B} + \ell_{A\bar{B}} - \ell_{AB}.$$

In general, the above relation is not true in either finite or large
samples.


² Strictly speaking the conditions are not identical. For additivity we
need $\tilde{H}_1'\tilde{I}^{-1}\tilde{H}_2 = 0$, which is implied by
$H_1'I^{-1}H_2 = 0$ for all $\theta$ satisfying $h(\theta) = 0$, but not
vice versa. However, without loss of generality, we can assume that the
parameter space over which $H_1'I^{-1}H_2 = 0$ has measure zero. Then we
can say that the conditions are almost surely identical.

But if we rewrite it as

$$[\ell_{\bar{A}\bar{B}} - \ell_{A\bar{B}}] = [\ell_{\bar{A}B} - \ell_{AB}]$$

i.e., $LR_{A\bar{B}} = LR_A$, where $LR_{A\bar{B}}$ is the test of the
$H_A$ restriction without imposing the $H_B$ restrictions, then
separability implies $LR_{A\bar{B}} \overset{a}{=} LR_A$, where
$\overset{a}{=}$ denotes asymptotic equivalence. Therefore separability
of the LR test implies the LR test will be additive in an asymptotic
sense.
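
The following sketch illustrates the LR relation on a simple case chosen here for convenience, not taken from the paper: an i.i.d. $N(\mu, \sigma^2)$ sample with the hypothetical restrictions $H_A: \mu = 0$ and $H_B: \sigma^2 = 1$. The function names and the simulated data are assumptions of the example. It computes the four log-likelihood values and shows that the additivity gap $LR_{AB} - (LR_A + LR_B)$ is exactly twice the difference between the two sides of the iff relation, and that it is small in a large sample drawn under the null.

```python
import math
import random


def normal_loglik(x, mu, sigma2):
    """Log-likelihood of an i.i.d. N(mu, sigma2) sample."""
    n = len(x)
    rss = sum((xi - mu) ** 2 for xi in x)
    return -0.5 * n * math.log(2 * math.pi * sigma2) - rss / (2 * sigma2)


def lr_components(x):
    """LR statistics for the illustrative H_A: mu = 0 and H_B: sigma2 = 1."""
    n = len(x)
    xbar = sum(x) / n
    s2_hat = sum((xi - xbar) ** 2 for xi in x) / n  # unrestricted MLE of sigma2
    s2_0 = sum(xi ** 2 for xi in x) / n             # MLE of sigma2 when mu = 0

    l_ab = normal_loglik(x, 0.0, 1.0)        # both restrictions imposed
    l_a_only = normal_loglik(x, 0.0, s2_0)   # only H_A imposed
    l_b_only = normal_loglik(x, xbar, 1.0)   # only H_B imposed
    l_none = normal_loglik(x, xbar, s2_hat)  # no restrictions imposed

    lr_ab = 2 * (l_none - l_ab)
    lr_a = 2 * (l_b_only - l_ab)   # H_A tested with H_B imposed
    lr_b = 2 * (l_a_only - l_ab)   # H_B tested with H_A imposed
    gap = lr_ab - (lr_a + lr_b)
    # Twice the difference between the two sides of the iff relation.
    identity = 2 * (l_none - l_b_only - l_a_only + l_ab)
    return lr_ab, lr_a, lr_b, gap, identity


random.seed(0)
sample = [random.gauss(0.0, 1.0) for _ in range(5000)]
lr_ab, lr_a, lr_b, gap, identity = lr_components(sample)
print(lr_ab, lr_a + lr_b, gap)
```

In this example the information matrix of $(\mu, \sigma^2)$ is diagonal, so the two hypotheses satisfy the sufficient condition for separability and the gap shrinks as the sample grows.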

Turning to the question of the additivity of W it is easy to show that,
given separability, a sufficient condition for W to be additive is that
$\hat{H}_1'\hat{I}^{-1}\hat{H}_2 = 0$, where "^" indicates the
quantities have been evaluated at the unrestricted maximum likelihood
estimate of $\theta$, say $\hat{\theta}$.

$$\begin{aligned}
W_{AB} &= h(\hat{\theta})'[\hat{H}'\hat{I}^{-1}\hat{H}]^{-1}h(\hat{\theta}) \\
&= h_1(\hat{\theta})'(\hat{H}_1'\hat{I}^{-1}\hat{H}_1)^{-1}h_1(\hat{\theta}) + h_2(\hat{\theta})'(\hat{H}_2'\hat{I}^{-1}\hat{H}_2)^{-1}h_2(\hat{\theta}) && \text{given } \hat{H}_1'\hat{I}^{-1}\hat{H}_2 = 0 \\
&= h_1(\dot{\theta})'(\dot{H}_1'\dot{I}^{-1}\dot{H}_1)^{-1}h_1(\dot{\theta}) + h_2(\ddot{\theta})'(\ddot{H}_2'\ddot{I}^{-1}\ddot{H}_2)^{-1}h_2(\ddot{\theta}) && \text{by separability} \\
&= W_A + W_B
\end{aligned}$$

where "·" and "··" denote the quantities have been evaluated at the
restricted maximum likelihood estimates with the restrictions
$h_2(\theta) = 0$ and $h_1(\theta) = 0$ imposed respectively. If we
partition the parameter vector $\theta$ as before and consider testing
restrictions of the form $H_0: \theta_2 = \theta_2^0$, where
$\theta_2^0$ is a vector of fixed constants, then the W test will be
additive if $\hat{I}^{22}$ is block diagonal with respect to the testing
parameters, where $\hat{I}^{22}$ denotes the (2,2) block of
$\hat{I}^{-1}$. In the next section, we provide some examples of the
additivity and non-additivity of the LR, W and LM tests.
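
The two steps of the W argument can be traced on a simple case chosen here for illustration, not taken from the paper: an i.i.d. $N(\mu, \sigma^2)$ sample with the hypothetical restrictions $H_A: \mu = 0$ and $H_B: \sigma^2 = 1$. The function name and simulated data are assumptions of the sketch. Because the information matrix of $(\mu, \sigma^2)$ is diagonal, the joint Wald form splits exactly at the unrestricted estimate; replacing each part by the component statistic evaluated with the other restriction imposed changes the total only slightly in a large sample.

```python
import random


def wald_components(x):
    """Wald statistics for the illustrative H_A: mu = 0 and H_B: sigma2 = 1."""
    n = len(x)
    xbar = sum(x) / n
    s2_hat = sum((xi - xbar) ** 2 for xi in x) / n  # unrestricted MLE of sigma2
    s2_0 = sum(xi ** 2 for xi in x) / n             # MLE of sigma2 with mu = 0

    # The information matrix of (mu, sigma2) is
    # diag(n / sigma2, n / (2 * sigma2 ** 2)), so the off-diagonal block is
    # zero and the joint form is a sum of two terms at the unrestricted MLE.
    w_ab = n * xbar ** 2 / s2_hat + n * (s2_hat - 1.0) ** 2 / (2 * s2_hat ** 2)

    # Component statistics, each evaluated with the other restriction imposed.
    w_a = n * xbar ** 2                              # sigma2 = 1 imposed
    w_b = n * (s2_0 - 1.0) ** 2 / (2 * s2_0 ** 2)    # mu = 0 imposed
    return w_ab, w_a, w_b


random.seed(0)
sample = [random.gauss(0.0, 1.0) for _ in range(20000)]
w_ab, w_a, w_b = wald_components(sample)
print(w_ab, w_a + w_b, w_ab - (w_a + w_b))
```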

3.   Some Examples

Consider the following linear regression problem

$$y_i = X_i'\beta + u_i, \qquad i = 1, 2, \ldots, N$$

where $X_i$ is a $k \times 1$ vector representing the $i$th observation
on $k$ fixed regressors, $\beta$ is a $k \times 1$ vector of fixed
unknown parameters, and the $u_i$ are assumed to be serially correlated
($\bar{I}$) and generated by a first order autoregressive (AR) process
$u_i = \rho u_{i-1} + \epsilon_i$, $|\rho| < 1$, where the $\epsilon_i$
are assumed to be normally and independently distributed but
heteroscedastic ($\bar{H}$) with the form
$V(\epsilon_i) = \sigma_i^2 = \sigma^2 + \alpha'Z_i$, where $Z_i$ is an
$l \times 1$ vector representing the $i$th observation on $l$ fixed
variables and $\alpha$ an $l \times 1$ vector of fixed unknown
parameters. If we let $H_0: u \sim HI$ and denote $LM_{HI}$, $LM_H$ and
$LM_I$ to be the LM statistics for testing $H_0$ against
$H: u \sim \bar{H}\bar{I}$, $H: u \sim \bar{H}I$ and
$H: u \sim H\bar{I}$ respectively, then Bera and Jarque (1982) have
shown that $LM_{HI} = LM_H + LM_I$. Our results indicate that the LR
test will also be additive. For this example, when
$u \sim N\bar{H}\bar{I}$ we have [calculated from the derivatives
(A.2)-(A.5) given in Appendix A]


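$$I = \begin{pmatrix}
\sum_i X_i^{*}X_i^{*\prime} & 0 & 0 & 0 \\
0 & \frac{1}{2}\sum_i \dfrac{1}{\sigma_i^4} & \frac{1}{2}\sum_i \dfrac{Z_i'}{\sigma_i^4} & 0 \\
0 & \frac{1}{2}\sum_i \dfrac{Z_i}{\sigma_i^4} & \frac{1}{2}\sum_i \dfrac{Z_iZ_i'}{\sigma_i^4} & 0 \\
0 & 0 & 0 & I_{\rho\rho}
\end{pmatrix}$$

where $X_i^{*} = (X_i - \rho X_{i-1})/\sigma_i$,
$I_{\rho\rho} = E(-\partial^2\ell/\partial\rho^2)$ is the information on
$\rho$, and the rows and columns are ordered as
$(\beta, \sigma^2, \alpha, \rho)$. From the above expression, it is
easily seen that $I^{\alpha\rho} = 0$ so that W also will be additive
asymptotically.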

If the regressor set includes a lagged dependent variable, say
$y_{i-1}$, then $I_{\beta\rho} \neq 0$, but the information matrix is
still block diagonal between $(\beta, \rho)$ and $(\sigma^2, \alpha)$ so
that the inverse will also be block diagonal. Hence the tests for
heteroscedasticity and serial correlation will still be additive³. The
introduction of a lagged dependent variable, $y_{i-1}$, into $Z_i$ will
not alter the structure of the information matrix nor invalidate the
additivity result.

The assumption of normality or, more importantly, the assumption that
$E(\epsilon_i^3) = 0$ is however critical. If this assumption is relaxed
then $I_{\beta\sigma^2}, I_{\beta\alpha} \neq 0$ but block diagonality
between $(\beta, \sigma^2, \alpha)$ and $\rho$ holds. If, in addition,
$y_{i-1}$ is introduced into the regressor set then $I_{\beta\rho}$ is
also non-zero and block diagonality is lost. Similarly if, instead of
appearing in the regressor set, $y_{i-1}$ appears in $Z_i$ then
$I_{\alpha\rho}$ is also non-zero and block diagonality is lost⁴. In
both these cases, additivity no longer holds.

From the previous example with only fixed regressors we can see that
if $h_1(\theta): R\beta = 0$, $h_2(\theta): \alpha = 0$ or
$h_1(\theta): R\beta = 0$, $h_2(\theta): \rho = 0$, then these
hypotheses will be additive and separable since $H_1'I^{-1}H_2 = 0$ for
all $\beta$. This implies that, under normality, if the sample is large,
while testing the restrictions $R\beta = 0$ we can ignore the presence
of autocorrelation or heteroscedasticity⁵. Also the different test
statistics can simply be added to form a joint test. This additivity
will disappear for the first hypothesis if the regressor set includes
$y_{i-1}$ and in the second case if $E(\epsilon_i^3) \neq 0$.

³ For any $y_{i-j}$, $I_{\beta\rho} \neq 0$.

⁴ Block diagonality is not lost if the lagged dependent variable
appearing in the regressor set or $Z_i$ is $y_{i-j}$, $j > 1$.

⁵ For example, Phillips and McCabe (1983) have shown the independence of
the common tests for stability of regression coefficients (linear
restriction)

In the following example, the LM and LR tests are additive but the
W test is not necessarily. Consider testing $u \sim N(0,1)$ in the
following framework

$$\frac{d\ell_i(c_1, c_2)}{du_i} = \frac{c_1 - u_i}{1 - c_1 u_i + c_2 u_i^2}$$

where one would test $H_0: c_1 = c_2 = 0$. It is shown in Bera and
Jarque (1981) that $LM_{c_1c_2} = LM_{c_1} + LM_{c_2}$ so that LR will
also be additive. However $\hat{I}^{c_1c_2}$ will not in general be zero
so that W may not be additive.

The last example is the case where none of the three tests are
additive. This is the case of testing the null hypothesis that the
disturbance term in a limited dependent variable model is normally
distributed and homoscedastic against the alternative hypothesis of
non-normality and heteroscedasticity [see Jarque and Bera (1982)].

4.   Conclusion 

Additivity of the LM test implies asymptotic additivity of the LR
test but not in general additivity of the W test. This shows another
computational advantage of the LM test. After carrying out
one-directional LM tests, a joint test can be obtained, when additivity
applies, simply by adding up the component statistics; that is, a number
of test statistics can be combined to form a joint test. For the LR
(and sometimes for W) tests such an operation is valid only for large
samples. Here we should also mention that since all three statistics
are asymptotically equivalent under the null hypothesis and for local
alternatives, additivity of the LM test implies asymptotic additivity
of both the W and LR tests under the null hypothesis and for local
alternatives.
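
As a minimal sketch of the "adding up" step, the code below sums component LM statistics and their degrees of freedom and refers the total to a chi-square distribution, which is the usual asymptotic reference distribution for LM statistics under the null. The helper names `chi2_sf` and `joint_lm` and the numerical statistics are hypothetical, introduced only for this example. The tail probability uses the closed-form expressions for integer degrees of freedom so that only the standard library is needed.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate with integer df >= 1."""
    y = x / 2.0
    if df % 2 == 0:
        # Q(m, y) = exp(-y) * sum_{k=0}^{m-1} y^k / k!
        m = df // 2
        return math.exp(-y) * sum(y ** k / math.factorial(k) for k in range(m))
    # Q(m + 1/2, y) = erfc(sqrt(y))
    #                 + exp(-y) * sum_{k=1}^{m} y^(k - 1/2) / Gamma(k + 1/2)
    m = (df - 1) // 2
    tail = sum(y ** (k - 0.5) / math.gamma(k + 0.5) for k in range(1, m + 1))
    return math.erfc(math.sqrt(y)) + math.exp(-y) * tail


def joint_lm(components):
    """Sum additive LM statistics given as (statistic, df) pairs."""
    stat = sum(s for s, _ in components)
    df = sum(d for _, d in components)
    return stat, df, chi2_sf(stat, df)


# Hypothetical one-directional statistics, for illustration only.
stat, df, p_value = joint_lm([(2.9, 1), (4.1, 2)])
print(stat, df, p_value)
```

The sum is only a valid joint statistic when the additivity condition holds; otherwise the joint LM statistic has to be computed directly.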

Pagan and Hall (1983) claim that a major disadvantage of examining
the additivity properties through the information matrix is that the
calculation of the information matrix is dependent on certain
distributional assumptions, e.g. symmetry of the disturbances, and that
additivity of the tests may merely reflect this fact. One of our
examples in section 3 illustrated the importance of the distributional
assumptions and showed that account can be taken of them in the
information matrix based approach.
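
APPENDIX

For our model $\theta = (\beta', \sigma^2, \alpha', \rho)'$,
$H_0: \alpha = 0, \rho = 0$ and the log-likelihood function
$\ell(\theta)$ is given by

$$\ell(\theta) = \sum_{i=1}^{N} \ell_i(\theta) = -\frac{N}{2}\ln 2\pi - \frac{1}{2}\sum_{i=1}^{N}\ln\sigma_i^2 - \frac{1}{2}\sum_{i=1}^{N}\frac{\epsilon_i^2}{\sigma_i^2} \qquad (A.1)$$

where $\sigma_i^2 = \sigma^2 + \alpha'Z_i$ and
$\epsilon_i = u_i - \rho u_{i-1}$ with $u_i = y_i - X_i'\beta$. From
the above equation the following first order derivatives are easily
obtained:

$$\frac{\partial\ell_i(\theta)}{\partial\beta} = \frac{1}{\sigma_i^2}(X_i - \rho X_{i-1})\epsilon_i \qquad (A.2)$$

$$\frac{\partial\ell_i(\theta)}{\partial\sigma^2} = -\frac{1}{2\sigma_i^2} + \frac{\epsilon_i^2}{2\sigma_i^4} \qquad (A.3)$$

$$\frac{\partial\ell_i(\theta)}{\partial\alpha} = -\frac{Z_i}{2\sigma_i^2} + \frac{\epsilon_i^2 Z_i}{2\sigma_i^4} \qquad (A.4)$$

and

$$\frac{\partial\ell_i(\theta)}{\partial\rho} = \frac{1}{\sigma_i^2}(y_{i-1} - X_{i-1}'\beta)\epsilon_i. \qquad (A.5)$$

Taking cross-products of the derivatives and then taking expectations,
we obtain the information matrix.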
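
The derivatives (A.2)-(A.5) can be checked numerically against the log-density in (A.1). The sketch below is a verification aid under stated assumptions: the function names, the simulated data and the parameter values are all made up for the example, and the first observation is used only to supply the lagged values. It compares the analytic score of each observation with a central finite-difference gradient.

```python
import math
import random


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def unpack(theta, k, l):
    """Split theta = (beta, sigma2, alpha, rho) into its parts."""
    return theta[:k], theta[k], theta[k + 1:k + 1 + l], theta[k + 1 + l]


def residuals(theta, y, X, Z, i):
    """Return (eps_i, lagged u, sigma_i^2) for observation i >= 1."""
    beta, sigma2, alpha, rho = unpack(theta, len(X[0]), len(Z[0]))
    u_lag = y[i - 1] - dot(X[i - 1], beta)
    eps = (y[i] - dot(X[i], beta)) - rho * u_lag
    return eps, u_lag, sigma2 + dot(alpha, Z[i])


def loglik_i(theta, y, X, Z, i):
    """Log-density of observation i, the i-th term of (A.1)."""
    eps, _, s2 = residuals(theta, y, X, Z, i)
    return -0.5 * math.log(2 * math.pi) - 0.5 * math.log(s2) - 0.5 * eps ** 2 / s2


def score_i(theta, y, X, Z, i):
    """Analytic score of observation i, stacking (A.2)-(A.5)."""
    k, l = len(X[0]), len(Z[0])
    rho = theta[k + 1 + l]
    eps, u_lag, s2 = residuals(theta, y, X, Z, i)
    d_beta = [(X[i][j] - rho * X[i - 1][j]) * eps / s2 for j in range(k)]  # (A.2)
    d_sigma2 = -1.0 / (2 * s2) + eps ** 2 / (2 * s2 ** 2)                  # (A.3)
    d_alpha = [d_sigma2 * Z[i][j] for j in range(l)]                       # (A.4)
    d_rho = u_lag * eps / s2                                               # (A.5)
    return d_beta + [d_sigma2] + d_alpha + [d_rho]


def numeric_score_i(theta, y, X, Z, i, h=1e-6):
    """Central finite-difference gradient of loglik_i, used only as a check."""
    grad = []
    for j in range(len(theta)):
        up = list(theta)
        dn = list(theta)
        up[j] += h
        dn[j] -= h
        grad.append((loglik_i(up, y, X, Z, i) - loglik_i(dn, y, X, Z, i)) / (2 * h))
    return grad


random.seed(1)
n = 6
X = [[1.0, random.gauss(0, 1)] for _ in range(n)]   # k = 2 regressors
Z = [[random.random()] for _ in range(n)]           # l = 1 variance regressor
y = [random.gauss(0, 1) for _ in range(n)]
theta = [0.5, -0.3, 1.0, 0.4, 0.2]   # (beta_1, beta_2, sigma2, alpha, rho)

max_err = max(
    abs(a - b)
    for i in range(1, n)
    for a, b in zip(score_i(theta, y, X, Z, i), numeric_score_i(theta, y, X, Z, i))
)
print(max_err)
```

Averaging the outer products of `score_i` over simulated samples would give a Monte Carlo approximation to the information matrix, which is one way to inspect its block structure.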